Robust Block Coordinate Descent
In this paper we present a novel randomized block coordinate descent method
for the minimization of a convex composite objective function. The method uses
(approximate) partial second-order (curvature) information, so that the
algorithm's performance is more robust when applied to highly nonseparable or
ill-conditioned problems. We call the method Robust Coordinate Descent (RCD). At
each iteration of RCD, a block of coordinates is sampled randomly, a quadratic
model is formed for that block, and the model is minimized approximately
(inexactly) to determine the search direction. An inexpensive line
search is then employed to ensure a monotonic decrease in the objective
function and acceptance of large step sizes. We prove global convergence of the
RCD algorithm, and we also present several results on the local convergence of
RCD for strongly convex functions. Finally, we present numerical results on
large-scale problems to demonstrate the practical performance of the method.
Comment: 23 pages, 6 figures
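The iteration described in this abstract (sample a block, form a quadratic model for it, minimize the model, then line-search for a monotonic decrease) can be sketched as follows. This is a minimal illustration, not the paper's exact method: the quadratic test problem, the uniform block sampler, the exact block solve, and the Armijo constants are all assumed choices.

```python
import numpy as np

# Illustrative sketch of one randomized block coordinate descent scheme with
# partial (block) curvature and a backtracking line search, on a smooth
# convex quadratic f(x) = 0.5 x^T A x - b^T x. All parameters are assumptions.

rng = np.random.default_rng(0)
n, k = 50, 5                        # problem dimension and block size
M = rng.standard_normal((n, n))
A = M @ M.T + 0.1 * np.eye(n)       # SPD Hessian, possibly ill-conditioned
b = rng.standard_normal(n)

f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b

x = np.zeros(n)
for it in range(500):
    S = rng.choice(n, size=k, replace=False)  # sample a random block
    g = grad(x)[S]
    H = A[np.ix_(S, S)]                       # partial curvature for the block
    d = np.linalg.solve(H, -g)                # minimize the block quadratic model
    # backtracking (Armijo) line search to ensure a monotonic decrease
    alpha, fx = 1.0, f(x)
    while True:
        x_new = x.copy()
        x_new[S] += alpha * d
        if f(x_new) <= fx + 1e-4 * alpha * (g @ d):
            break
        alpha *= 0.5
    x = x_new

x_star = np.linalg.solve(A, b)
print(np.linalg.norm(x - x_star))  # distance to the minimizer
```

Using the block sub-Hessian rather than a scalar step size is what makes the search direction account for the curvature of the sampled block, which is the robustness mechanism the abstract refers to.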
Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees
Local graph clustering methods aim to detect small clusters in very large
graphs without the need to process the whole graph. They are fundamental and
scalable tools for a wide range of tasks such as local community detection,
node ranking and node embedding. While prior work on local graph clustering
mainly focuses on graphs without node attributes, modern real-world graph
datasets typically come with node attributes that provide valuable additional
information. We present a simple local graph clustering algorithm for graphs
with node attributes, based on the idea of diffusing mass locally in the graph
while accounting for both structural and attribute proximities. Using
high-dimensional concentration results, we provide statistical guarantees on
the performance of the algorithm for the recovery of a target cluster with a
single seed node. We give conditions under which a target cluster generated
from a fairly general contextual random graph model, which includes both the
stochastic block model and the planted cluster model as special cases, can be
fully recovered with bounded false positives. Empirically, we validate all
theoretical claims using synthetic data, and we show that incorporating node
attributes leads to superior local clustering performances using real-world
graph datasets.
Comment: 30 pages, 2 figures, 9 tables
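The central idea of the abstract, diffusing mass locally from a seed node over a graph whose edge weights account for both structural and attribute proximity, can be illustrated on a toy two-cluster graph. This is a simplified stand-in (a few steps of a lazy random walk), not the paper's weighted flow diffusion algorithm; the graph model, attribute kernel, and walk length are assumptions.

```python
import numpy as np

# Toy illustration: edge weights combine structure (adjacency) with node
# attribute similarity; a short lazy random walk from a single seed then
# concentrates mass inside the seed's cluster. All choices are illustrative.

rng = np.random.default_rng(1)
n = 20
# two planted clusters: nodes 0-9 and 10-19
A = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        same = (i < 10) == (j < 10)
        if rng.random() < (0.6 if same else 0.1):
            A[i, j] = A[j, i] = 1.0

# one-dimensional node attributes correlated with cluster membership
attr = np.where(np.arange(n) < 10, 0.0, 1.0) + 0.1 * rng.standard_normal(n)

# combine structural and attribute proximity into edge weights
W = A * np.exp(-np.abs(attr[:, None] - attr[None, :]))

# lazy random-walk diffusion from a single seed node (node 0)
P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
p = np.zeros(n)
p[0] = 1.0
for _ in range(5):
    p = 0.5 * p + 0.5 * p @ P

cluster = np.argsort(-p)[:10]  # nodes holding the most diffused mass
print(sorted(cluster))
```

Down-weighting edges between nodes with dissimilar attributes keeps the diffusion from leaking across the cluster boundary, which is the intuition behind the statistical guarantees described above.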
Higher-order methods for large-scale optimization
There has been an increased interest in optimization for the analysis of large-scale
data sets which require gigabytes or terabytes of data to be stored. A variety of
applications originate from the fields of signal processing, machine learning and
statistics. Seven representative applications are described below.
- Magnetic Resonance Imaging (MRI): A medical imaging tool used to scan
the anatomy and the physiology of a body.
- Image inpainting: A technique for reconstructing degraded parts of an image.
- Image deblurring: Image processing tool for removing the blurriness of a
photo caused by natural phenomena, such as motion.
- Radar pulse reconstruction.
- Genome-Wide Association study (GWA): DNA comparison between two
groups of people (with/without a disease) in order to investigate factors
that a disease depends on.
- Recommendation systems: Classification of data (e.g., music or video) based
on user preferences.
- Data fitting: Sampled data are used to simulate the behaviour of observed
quantities, for example, the estimation of global temperature from historical
data.
Large-scale problems impose restrictions on the methods employed so far. New
methods have to be memory-efficient and, ideally, should offer noticeable
progress towards a solution within seconds.
First-order methods meet some of these requirements: they avoid matrix
factorizations, they have low memory requirements, and they sometimes offer
fast progress in the initial stages of optimization. Unfortunately, as
demonstrated by numerical experiments in this thesis, first-order methods miss
essential information about the conditioning of the problems, which can result
in slow practical convergence. The main advantage of first-order methods,
relying only on simple gradient or coordinate updates, becomes their essential
weakness.
We do not think this inherent weakness of first-order methods can be remedied.
For this reason, the present thesis aims at the development and implementation
of inexpensive higher-order methods for large-scale problems.
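The conditioning argument above can be made concrete with a small numerical illustration: on an ill-conditioned quadratic, gradient descent (first-order) crawls, while a single Newton step (which uses curvature) lands on the minimizer. The specific problem and step size are illustrative assumptions.

```python
import numpy as np

# Illustration of the conditioning issue: minimize
# f(x) = 0.5 x^T A x - b^T x with a condition number of 1e4.
A = np.diag([1.0, 1e4])
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)   # exact minimizer

# gradient descent with the largest stable step size 1/L, L = 1e4
x = np.zeros(2)
for _ in range(1000):
    x = x - (1.0 / 1e4) * (A @ x - b)
gd_err = np.linalg.norm(x - x_star)

# a single Newton step uses the curvature and solves the quadratic exactly
x0 = np.zeros(2)
xn = x0 - np.linalg.solve(A, A @ x0 - b)
newton_err = np.linalg.norm(xn - x_star)

print(gd_err, newton_err)  # gradient descent barely moves the stiff coordinate
```

After 1000 gradient steps the error in the well-conditioned direction has shrunk by only a factor of roughly (1 - 1e-4)^1000 ≈ 0.9, while the Newton step is exact: this is the "missed conditioning information" the thesis motivates higher-order methods with.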
LASAGNE: Locality And Structure Aware Graph Node Embedding
In this work we propose Lasagne, a methodology to learn locality- and
structure-aware graph node embeddings in an unsupervised way. In particular, we
show that the performance of existing random-walk based approaches depends
strongly on the structural properties of the graph, e.g., the size of the
graph, whether the graph has a flat or upward-sloping Network Community Profile
(NCP), whether the graph is expander-like, whether the classes of interest are
more k-core-like or more peripheral, etc. For larger graphs with flat NCPs that
are strongly expander-like, existing methods lead to random walks that expand
rapidly, touching many dissimilar nodes, thereby leading to lower-quality
vector representations that are less useful for downstream tasks. Rather than
relying on global random walks or neighbors within fixed hop distances, Lasagne
exploits strongly local Approximate Personalized PageRank stationary
distributions to more precisely engineer local information into node
embeddings. This leads, in particular, to more meaningful and more useful
vector representations of nodes in poorly-structured graphs. We show that
Lasagne leads to significant improvement in downstream multi-label
classification for larger graphs with flat NCPs, that it is comparable for
smaller graphs with upward-sloping NCPs, and that it is comparable to existing
methods on link prediction tasks.